Speaker-independent phoneme recognition on TIMIT database using integrated time-delay neural networks (TDNNs)

Authors

  • Nobuo Hataoka
  • Alexander H. Waibel
Abstract

School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.

This paper describes a new structure of Neural Networks (NNs) for speaker-independent and context-independent phoneme recognition. The structure integrates several Time-Delay Neural Networks (TDNNs), separated according to the duration of phonemes. As a result, the proposed structure deals with phonemes of varying duration more effectively. In the experimental evaluation of the proposed structure, recognition of 16 English vowels was performed using 5268 vowel tokens picked from 480 sentences spoken by 140 speakers (98 males and 42 females) in the TIMIT (TI-MIT) database. The training set contained 4326 tokens from 100 speakers (69 males and 31 females), and the test set contained 942 tokens from 40 speakers (29 males and 11 females). The result was a 60.5% recognition rate (around 70% for a collapsed 13-vowel case), an improvement over the 56% obtained with a single TDNN structure, showing the effectiveness of the proposed structure in using temporal information.
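The idea of combining TDNNs with different temporal extents can be illustrated with a minimal sketch. The abstract does not say how the sub-networks are integrated, so everything below is an assumption made for illustration: the function names, the single sigmoid layer, the window lengths (3, 5, and 7 frames), the 16 input features per frame, and the averaging of scores across sub-networks. Only the 16 output classes come from the abstract.

```python
import math
import random


def time_delay_layer(frames, weights, bias):
    """Apply one time-delay layer.

    frames:  list of T feature vectors (each a list of F floats)
    weights: weights[u][d][f] for output unit u, delay d, input feature f
    bias:    bias[u] for each output unit
    Returns T - D + 1 output vectors, where D is the number of delays.
    """
    n_delays = len(weights[0])
    outputs = []
    for t in range(len(frames) - n_delays + 1):
        row = []
        for u, unit_w in enumerate(weights):
            s = bias[u]
            # The same weights are reused at every time position t.
            for d in range(n_delays):
                for f, x in enumerate(frames[t + d]):
                    s += unit_w[d][f] * x
            row.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid activation
        outputs.append(row)
    return outputs


def tdnn_scores(frames, weights, bias):
    """Average each output unit's activation over all time positions."""
    acts = time_delay_layer(frames, weights, bias)
    return [sum(col) / len(acts) for col in zip(*acts)]


def integrated_scores(frames, subnets):
    """Combine sub-networks with different window lengths.

    Averaging their class scores is an illustrative assumption,
    not the integration scheme of the paper.
    """
    usable = [(w, b) for (w, b) in subnets if len(w[0]) <= len(frames)]
    per_net = [tdnn_scores(frames, w, b) for w, b in usable]
    return [sum(col) / len(per_net) for col in zip(*per_net)]


def random_subnet(n_units, n_delays, n_features, rng):
    """Hypothetical helper: untrained weights, only to make the sketch run."""
    weights = [[[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
                for _ in range(n_delays)] for _ in range(n_units)]
    return weights, [0.0] * n_units


rng = random.Random(0)
N_VOWELS = 16    # 16 vowel classes, as in the abstract
N_FEATURES = 16  # assumed number of spectral features per frame
subnets = [random_subnet(N_VOWELS, d, N_FEATURES, rng) for d in (3, 5, 7)]
token = [[rng.random() for _ in range(N_FEATURES)] for _ in range(10)]
scores = integrated_scores(token, subnets)
predicted = max(range(N_VOWELS), key=scores.__getitem__)
```

In this sketch a short token is scored only by the sub-networks whose window fits inside it, which is one simple way to let window length track phoneme duration. The weights are random, so `predicted` is meaningless until the sub-networks are trained.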

Related resources

Phoneme Probability Estimation with Dynamic Sparsely Connected Artificial Neural Networks

This paper presents new methods for training large neural networks for phoneme probability estimation. An architecture combining time-delay windows and recurrent connections is used to capture the important dynamic information of the speech signal. Because the number of connections in a fully connected recurrent network grows super-linearly with the number of hidden units, schemes for sparse conn...

Automatic Continuous Speech Recognition with Rapid Speaker Adaptation for Human/machine Interaction

This thesis presents work in three main directions of the automatic speech recognition field. The work within two of these – dynamic decoding and hybrid HMM/ANN speech recognition – has resulted in a real-time speech recognition system, currently in use in the human/machine dialogue demonstration system WAXHOLM, developed at the department. The third direction is fast unsupervised speaker adapt...

Spotting Japanese CV-Syllables and Phonemes Using Time-Delay Neural Networks

Syllable or phoneme spotting, if reliably achieved, provides a good solution to the spoken word and/or continuous speech recognition problem. We previously showed that the Time-Delay Neural Network (TDNN) provided excellent recognition performance (98.6%) on the "BDG" consonant task. We would also like to extend the encouraging performance of TDNN to word/continuous speech recognition. In thi...

Sparse connection and pruning in large dynamic artificial neural networks

This paper presents new methods for training large neural networks for phoneme probability estimation. A combination of the time-delay architecture and the recurrent network architecture is used to capture the important dynamic information of the speech signal. Motivated by the fact that the number of connections in fully connected recurrent networks grows super-linearly with the number of hidden...

Novel Objective Function for Improved Phoneme Recognition Using Time-delay Neural Networks

In this paper we show how recognition performance in automated speech perception can be significantly improved by additional lipreading, so-called "speech-reading". We show this on an extension of an existing state-of-the-art speech recognition system, a modular MS-TDNN. The acoustic and visual speech data is preclassified in two separate front-end phoneme TDNNs and com...


Publication date: 1990